UF HemBank 1852 case study#
Epigenomic analysis#
UF HemBank 1852 (20x coverage)#
Est. bases: 63Gb
Total CpGs: 28,983,095
Overlapping CpGs: 331,886
Est. sequencing time: >5000min
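The estimated base counts listed for each subsample are consistent with mean coverage multiplied by genome size. A minimal sketch of that arithmetic, assuming a human genome of roughly 3.1 Gb; the genome size and the helper name `estimated_bases` are assumptions for illustration and do not come from the notebook:

```python
# Assumed approximate human genome size in bases (not stated in the source).
GENOME_SIZE_BASES = 3.1e9


def estimated_bases(coverage: float, genome_size: float = GENOME_SIZE_BASES) -> float:
    """Approximate number of sequenced bases for a given mean coverage."""
    return coverage * genome_size


# Coverage levels used in this case study.
for coverage in (0.0001, 0.001, 0.01, 0.1, 1, 10, 20):
    print(f"{coverage}x -> ~{estimated_bases(coverage):.3g} bases")
```

Under this assumption, 0.0001x corresponds to roughly 300 kb and 20x to roughly 62 Gb, close to the estimates given in each section.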
import sys
import pandas as pd
# Make the project's local `source` package importable from this notebook.
sys.path.append('../')
from source.bokeh_plots import *
from source.data_visualization import *
# Render Bokeh plots inline in the notebook.
output_notebook()
# Location of the processed MethylScore outputs.
mount = '/mnt/e/'
input_path = mount + 'MethylScore_v2/Processed_Data/'
# Load the processed 20x sample and plot it on the PaCMAP embedding.
test_sample_name = 'uf_hembank_1852'
df_nanopore = pd.read_pickle(input_path + test_sample_name + '_processed.pkl')
plot_linked_scatters(df_nanopore, table=False, test_sample=test_sample_name,
                     xaxis="PaCMAP 1 of 2", yaxis="PaCMAP 2 of 2",
                     cols=['WHO 2022 Diagnosis'])
# Show the risk and phenotype calls for the last row (the test sample).
df_nanopore.iloc[-1:, :][['AML Epigenomic Risk', 'AML Epigenomic Risk P(High Risk)',
                          'AL Epigenomic Phenotype',
                          f'P({df_nanopore.iloc[-1:, :]["AL Epigenomic Phenotype"].item()})']]
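The final expression in each cell selects the last row of the data frame and builds the probability column name from that row's predicted phenotype. The same pattern can be wrapped in a small helper, sketched below. The function name `latest_sample_calls` and the toy data frame are hypothetical; the toy values are placeholders, not results from this study.

```python
import pandas as pd


def latest_sample_calls(df: pd.DataFrame) -> pd.DataFrame:
    """Return the risk and phenotype calls for the last row of `df`."""
    last = df.iloc[-1:]
    # The probability column is named after the predicted phenotype itself.
    phenotype = last["AL Epigenomic Phenotype"].item()
    columns = [
        "AML Epigenomic Risk",
        "AML Epigenomic Risk P(High Risk)",
        "AL Epigenomic Phenotype",
        f"P({phenotype})",
    ]
    return last[columns]


# Toy data frame with placeholder values, only to exercise the helper.
toy = pd.DataFrame(
    {
        "AML Epigenomic Risk": ["Low", "High"],
        "AML Epigenomic Risk P(High Risk)": [0.2, 0.7],
        "AL Epigenomic Phenotype": ["Otherwise-Normal Control", "AML with NUP98-fusion"],
        "P(Otherwise-Normal Control)": [0.9, 0.1],
        "P(AML with NUP98-fusion)": [0.1, 0.9],
    },
    index=["toy_reference", "toy_sample"],
)
calls = latest_sample_calls(toy)
print(calls)
```

In the notebook, the equivalent call would be `latest_sample_calls(df_nanopore)` after each `pd.read_pickle`, assuming the test sample is always the last row, as the original cells imply.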
UF HemBank 1852 (0.0001x coverage)#
Est. bases: ~300Kb
Total CpGs: 3,248
Overlapping CpGs: 27
Est. sequencing time: <2min
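The ratio of overlapping to total CpGs can be checked directly from the counts reported in the sections of this case study. The sketch below only restates those reported counts; the helper name `overlap_fraction` is an assumption for illustration.

```python
# (total CpGs, overlapping CpGs) as reported in each section of this case study.
cpg_counts = {
    "0.0001x": (3_248, 27),
    "0.001x": (28_865, 346),
    "0.01x": (274_747, 2_571),
    "0.1x": (2_606_667, 27_376),
    "1x": (17_208_161, 190_119),
    "10x": (28_875_278, 331_375),
    "20x": (28_983_095, 331_886),
}


def overlap_fraction(total: int, overlapping: int) -> float:
    """Share of detected CpGs that are counted as overlapping."""
    return overlapping / total


fractions = {cov: overlap_fraction(t, o) for cov, (t, o) in cpg_counts.items()}
for cov, frac in fractions.items():
    print(f"{cov}: {frac:.2%} of detected CpGs overlap")
```

At every coverage level the share is on the order of 1%.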
test_sample_name = 'uf_hembank_1852_00001x'
df_nanopore = pd.read_pickle(input_path + test_sample_name + '_processed.pkl')
plot_linked_scatters(df_nanopore, table=False, test_sample=test_sample_name,
                     xaxis="PaCMAP 1 of 2", yaxis="PaCMAP 2 of 2",
                     cols=['WHO 2022 Diagnosis'])
df_nanopore.iloc[-1:, :][['AML Epigenomic Risk', 'AML Epigenomic Risk P(High Risk)',
                          'AL Epigenomic Phenotype',
                          f'P({df_nanopore.iloc[-1:, :]["AL Epigenomic Phenotype"].item()})']]
| | AML Epigenomic Risk | AML Epigenomic Risk P(High Risk) | AL Epigenomic Phenotype | P(AML with mutated NPM1) |
|---|---|---|---|---|
| uf_hembank_1852_00001x | High | 0.676 | AML with mutated NPM1 | 0.72 |
UF HemBank 1852 (0.001x coverage)#
Est. bases: ~3Mb
Total CpGs: 28,865
Overlapping CpGs: 346
Est. sequencing time: 2.5min
test_sample_name = 'uf_hembank_1852_0001x'
df_nanopore = pd.read_pickle(input_path + test_sample_name + '_processed.pkl')
plot_linked_scatters(df_nanopore, table=False, test_sample=test_sample_name,
                     xaxis="PaCMAP 1 of 2", yaxis="PaCMAP 2 of 2",
                     cols=['WHO 2022 Diagnosis'])
df_nanopore.iloc[-1:, :][['AML Epigenomic Risk', 'AML Epigenomic Risk P(High Risk)',
                          'AL Epigenomic Phenotype',
                          f'P({df_nanopore.iloc[-1:, :]["AL Epigenomic Phenotype"].item()})']]
| | AML Epigenomic Risk | AML Epigenomic Risk P(High Risk) | AL Epigenomic Phenotype | P(Otherwise-Normal Control) |
|---|---|---|---|---|
| uf_hembank_1852_0001x | High | 0.561 | Otherwise-Normal Control | 0.995 |
UF HemBank 1852 (0.01x coverage)#
Est. bases: ~30Mb
Total CpGs: 274,747
Overlapping CpGs: 2,571
Est. sequencing time: 6min
test_sample_name = 'uf_hembank_1852_001x'
df_nanopore = pd.read_pickle(input_path + test_sample_name + '_processed.pkl')
plot_linked_scatters(df_nanopore, table=False, test_sample=test_sample_name,
                     xaxis="PaCMAP 1 of 2", yaxis="PaCMAP 2 of 2",
                     cols=['WHO 2022 Diagnosis'])
df_nanopore.iloc[-1:, :][['AML Epigenomic Risk', 'AML Epigenomic Risk P(High Risk)',
                          'AL Epigenomic Phenotype',
                          f'P({df_nanopore.iloc[-1:, :]["AL Epigenomic Phenotype"].item()})']]
| | AML Epigenomic Risk | AML Epigenomic Risk P(High Risk) | AL Epigenomic Phenotype | P(AML with NUP98-fusion) |
|---|---|---|---|---|
| uf_hembank_1852_001x | High | 0.755 | AML with NUP98-fusion | 0.934 |
UF HemBank 1852 (0.1x coverage)#
Est. bases: ~300Mb
Total CpGs: 2,606,667
Overlapping CpGs: 27,376
Est. sequencing time: 26min
test_sample_name = 'uf_hembank_1852_01x'
df_nanopore = pd.read_pickle(input_path + test_sample_name + '_processed.pkl')
plot_linked_scatters(df_nanopore, table=False, test_sample=test_sample_name,
                     xaxis="PaCMAP 1 of 2", yaxis="PaCMAP 2 of 2",
                     cols=['WHO 2022 Diagnosis'])
df_nanopore.iloc[-1:, :][['AML Epigenomic Risk', 'AML Epigenomic Risk P(High Risk)',
                          'AL Epigenomic Phenotype',
                          f'P({df_nanopore.iloc[-1:, :]["AL Epigenomic Phenotype"].item()})']]
| | AML Epigenomic Risk | AML Epigenomic Risk P(High Risk) | AL Epigenomic Phenotype | P(AML with NUP98-fusion) |
|---|---|---|---|---|
| uf_hembank_1852_01x | High | 0.77 | AML with NUP98-fusion | 0.944 |
UF HemBank 1852 (1x coverage)#
Est. bases: ~3Gb
Total CpGs: 17,208,161
Overlapping CpGs: 190,119
Est. sequencing time: 143min
test_sample_name = 'uf_hembank_1852_1x'
df_nanopore = pd.read_pickle(input_path + test_sample_name + '_processed.pkl')
plot_linked_scatters(df_nanopore, table=False, test_sample=test_sample_name,
                     xaxis="PaCMAP 1 of 2", yaxis="PaCMAP 2 of 2",
                     cols=['WHO 2022 Diagnosis'])
df_nanopore.iloc[-1:, :][['AML Epigenomic Risk', 'AML Epigenomic Risk P(High Risk)',
                          'AL Epigenomic Phenotype',
                          f'P({df_nanopore.iloc[-1:, :]["AL Epigenomic Phenotype"].item()})']]
| | AML Epigenomic Risk | AML Epigenomic Risk P(High Risk) | AL Epigenomic Phenotype | P(AML with NUP98-fusion) |
|---|---|---|---|---|
| uf_hembank_1852_1x | High | 0.698 | AML with NUP98-fusion | 0.944 |
UF HemBank 1852 (10x coverage)#
Est. bases: ~30Gb
Total CpGs: 28,875,278
Overlapping CpGs: 331,375
Est. sequencing time: 1575min
test_sample_name = 'uf_hembank_1852_10x'
df_nanopore = pd.read_pickle(input_path + test_sample_name + '_processed.pkl')
plot_linked_scatters(df_nanopore, table=False, test_sample=test_sample_name,
                     xaxis="PaCMAP 1 of 2", yaxis="PaCMAP 2 of 2",
                     cols=['WHO 2022 Diagnosis'])
df_nanopore.iloc[-1:, :][['AML Epigenomic Risk', 'AML Epigenomic Risk P(High Risk)',
                          'AL Epigenomic Phenotype',
                          f'P({df_nanopore.iloc[-1:, :]["AL Epigenomic Phenotype"].item()})']]
| | AML Epigenomic Risk | AML Epigenomic Risk P(High Risk) | AL Epigenomic Phenotype | P(AML with NUP98-fusion) |
|---|---|---|---|---|
| uf_hembank_1852_10x | High | 0.711 | AML with NUP98-fusion | 0.926 |
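To compare the calls across coverage levels in one place, the per-subsample results reported in the tables of this case study can be collected into a single data frame. This is a sketch that restates those reported values. Using the 10x call as the reference is an assumption made here for illustration, and the column names are hypothetical.

```python
import pandas as pd

# Values copied from the result tables reported for each subsample.
results = pd.DataFrame(
    [
        ("0.0001x", 27, 0.676, "AML with mutated NPM1", 0.72),
        ("0.001x", 346, 0.561, "Otherwise-Normal Control", 0.995),
        ("0.01x", 2571, 0.755, "AML with NUP98-fusion", 0.934),
        ("0.1x", 27376, 0.77, "AML with NUP98-fusion", 0.944),
        ("1x", 190119, 0.698, "AML with NUP98-fusion", 0.944),
        ("10x", 331375, 0.711, "AML with NUP98-fusion", 0.926),
    ],
    columns=["coverage", "overlapping_cpgs", "p_high_risk", "phenotype", "p_phenotype"],
)

# Assumption: treat the highest-coverage call in this table (10x) as the reference.
reference_call = results.iloc[-1]["phenotype"]
results["matches_10x"] = results["phenotype"] == reference_call

print(results.to_string(index=False))
```

For this single sample, the phenotype call agrees with the 10x call from 0.01x coverage (2,571 overlapping CpGs) upward, while the two lowest-coverage subsamples yield different phenotypes. The risk call is "High" at every level.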
Genomic analysis#
Insertion upstream of NUP98 TSS#
